Sparse Dynamic Programming for Longest Common Subsequence from Fragments
Authors
Abstract
Sparse Dynamic Programming has emerged as an essential tool for the design of efficient algorithms for optimization problems coming from such diverse areas as computer science, computational biology, and speech recognition. We provide a new sparse dynamic programming technique that extends the Hunt–Szymanski paradigm for the computation of the longest common subsequence (LCS) and apply it to solve the LCS from Fragments problem: given a pair of strings X and Y (of lengths n and m, respectively) and a set M of matching substrings of X and Y, find the longest common subsequence based only on the symbol correspondences induced by the substrings. This problem arises in an application to analysis of software systems. Our algorithm solves the problem in O(|M| log |M|) time using balanced trees, or O(|M| log log min(|M|, nm/|M|)) time using Johnson's version of Flat Trees. These bounds apply for two cost measures. The algorithm can also be adapted to finding the usual LCS in O((m + n) log |Σ| + |M| log |M|) time using balanced trees or O((m + n) log |Σ| + |M| log log min(|M|, nm/|M|)) time using Johnson's version of Flat Trees, where M is the set of maximal matches between substrings of X and ...
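To make the problem statement concrete, the following is a minimal baseline sketch in Python, not the algorithm of the paper. It assumes each fragment is given as a hypothetical triple (i, j, k) meaning that X[i:i+k] equals Y[j:j+k], expands every fragment into its individual symbol correspondences, and then runs a Hunt–Szymanski-style sweep over those match points. The expansion step is exactly what the paper's technique avoids, so the running time here depends on the total length of the fragments rather than on |M|. The function name and the fragment encoding are assumptions made for illustration.

```python
from bisect import bisect_left


def lcs_from_fragments(fragments):
    """Length of the longest common subsequence that uses only the symbol
    correspondences induced by the given fragments.

    Assumption: each fragment is a triple (i, j, k) with 0-based start
    positions i in X and j in Y and length k, i.e. X[i:i+k] == Y[j:j+k].
    It induces the match points (i + t, j + t) for 0 <= t < k.
    """
    # Expand fragments into match points (duplicates collapse in the set).
    points = {(i + t, j + t) for (i, j, k) in fragments for t in range(k)}

    # Sort by row ascending and, within a row, by column descending, so that
    # two points in the same row can never be chained together.
    ordered = sorted(points, key=lambda p: (p[0], -p[1]))

    # tails[r] is the smallest column that ends a chain of r + 1 points.
    tails = []
    for _, col in ordered:
        pos = bisect_left(tails, col)  # strict increase in the column
        if pos == len(tails):
            tails.append(col)
        else:
            tails[pos] = col
    return len(tails)
```

For example, two crossing fragments such as (0, 2, 2) and (2, 0, 2) cannot both contribute, whereas two overlapping fragments on the same diagonal, such as (0, 0, 2) and (1, 1, 2), merge into a single run of three correspondences.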
Similar resources
Longest Common Subsequence from Fragments via Sparse Dynamic
Sparse Dynamic Programming has emerged as an essential tool for the design of efficient algorithms for optimization problems coming from such diverse areas as Computer Science, Computational Biology and Speech Recognition [7, 11, 15]. We provide a new Sparse Dynamic Programming technique that extends the Hunt-Szymanski [2, 9, 8] paradigm for the computation of the Longest Common Subsequence (LCS) a...
Efficient algorithms for the longest common subsequence in $k$-length substrings
Finding the longest common subsequence in k-length substrings (LCSk) is a recently proposed problem motivated by computational biology. This is a generalization of the well-known LCS problem in which matching symbols from two sequences A and B are replaced with matching non-overlapping substrings of length k from A and B. We propose several algorithms for LCSk, being non-trivial incarnations of...
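The LCSk definition above can be illustrated with a straightforward quadratic-table dynamic program. This is only a sketch of the problem under the stated definition, assuming k >= 1, and it is not one of the algorithms proposed in that paper; the function name is hypothetical.

```python
def lcsk_length(a, b, k):
    """Maximum number of matching, non-overlapping, order-preserving
    length-k substring pairs between a and b (assumes k >= 1).

    dp[i][j] holds the answer for the prefixes a[:i] and b[:j].
    """
    n, m = len(a), len(b)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Skip the last symbol of one of the two prefixes.
            best = max(dp[i - 1][j], dp[i][j - 1])
            # Or end a matched pair of length-k substrings at (i, j).
            if i >= k and j >= k and a[i - k:i] == b[j - k:j]:
                best = max(best, dp[i - k][j - k] + 1)
            dp[i][j] = best
    return dp[n][m]
```

With k = 1 the recurrence reduces to the classical LCS length; the substring comparison makes this sketch run in O(nmk) time.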
A Specialized Branching and Fathoming Technique for The Longest Common Subsequence Problem
Given a set S = {S1, ..., Sk} of finite strings, the k-longest common subsequence problem (k-LCSP) seeks a string L of maximum length such that L is a subsequence of each Si for i = 1, ..., k. This paper presents a technique, specialized branching, that solves k-LCSP. Specialized branching combines the benefits of both dynamic programming and branch and bound to reduce the search space. For la...
New Tabulation and Sparse Dynamic Programming Based Techniques for Sequence Similarity Problems
Calculating the length of a longest common subsequence (LCS) of two strings A and B of lengths n and m is a classic research topic, with many worst-case oriented results known. We present two algorithms for LCS length calculation with O(mn log log n / log n) and O(mn / log n + r) time complexity, respectively, the latter working for r = o(mn/(log n log log n)), where r is the number of matches in the ...
A simple algorithm for the constrained sequence problems
In this paper we address the constrained longest common subsequence problem. Given two sequences X, Y and a constrained sequence P, a sequence Z is a constrained longest common subsequence for X and Y with respect to P if Z is the longest subsequence of X and Y such that P is a subsequence of Z. Recently, Tsai [7] proposed an O(n² · m² · r) time algorithm to solve this problem using dynamic prog...
Journal:
- J. Algorithms
Volume 42
Publication date: 2002